The starting point of this work is a framework for modelling systems with dynamic
process creation, equipped with a procedure to detect symmetric executions (i.e., executions that differ
only by the identities of processes). This makes it possible to reduce the state space, potentially to an
exponentially smaller size, and, because process identifiers are never reused, it can also reduce
some infinite state spaces to finite ones. However, in this approach, the procedure
to detect symmetries does not allow for computationally efficient algorithms, mainly because
each newly computed state has to be compared with every already reached state.

In this paper, we propose a new approach to detect symmetries in this framework that
solves this problem, thus enabling efficient algorithms. We formalise a canonical
representation of states and identify a sufficient condition on the analysed model that guarantees
that every symmetry can be detected. For models that do not satisfy this condition, our
approach is still correct but does not guarantee a maximal reduction of the state space.

1 Introduction 

The problem of detecting symmetries during the construction of state spaces has been widely
explored [9]. Given a formalism and its operational semantics, the state space is built starting
from an initial state. We repeatedly compute the successors of every discovered state and
aggregate them until a fixed point is reached, which is the set of all reachable states. To
add abstraction to our formalism we can define equivalence classes of states; two states in the
same class are said to be symmetric. Each time a state symmetric to an already visited one is
found, it is considered as already known: if the abstraction is well defined,
further exploration of this state leads only to behaviour that has already been analysed
and may thus be omitted. The resulting state space corresponds to the quotient graph
of equivalence classes of states (symmetric states) and can be exponentially smaller than the
original graph [2]; more drastically, infinite state spaces can be reduced to finite ones if the
number of equivalence classes is finite. Reductions by symmetries have been implemented in a
variety of tools and have proved to be a successful technique [71 [m [1].

Here we focus on symmetries induced by dynamic process creation in multi-threaded
computing systems. The goal of formalisms using this paradigm is to allow processes to be created at
runtime. This ability contributes to the explosion of state spaces: in particular, the order in which
processes are created, despite being irrelevant, introduces a large amount of interleaving. Moreover,
processes may be created endlessly, which leads to infinite state spaces if states cannot be
identified as equivalent. In the formalism we focus on, process names are irrelevant: it is the
relations between processes that matter, not their identities. For example, consider an HTTP
server. Each time it receives a request for the home page of a website, it creates a new thread
to handle the request. Now suppose that the request has been handled and the server receives
another one for the same home page; it then creates a new thread to handle the new
request. These two threads have different identities, but they answer the same kind of request
under the same conditions, so the resulting behaviour is the same modulo thread identifiers: they are
clearly symmetric. This example illustrates the nature of the symmetries we want to detect.

The current work is based on the approach developed in [10], [11], [6], which presents a method
to detect symmetries introduced by process identifiers. This approach reduces the
detection of symmetries to the computation of graph isomorphisms. Each newly discovered state is
transformed into a graph; if this graph is isomorphic to the graph of some visited
state, then the two states are symmetric. The main problem with this approach is that each time
we discover a state, we have to look for an isomorphism with every visited state.

In this paper we present an effective way to detect symmetries without computing graph
isomorphisms. It provides representations of markings that allow direct comparison of states,
with a hash table for example, instead of requiring a comparison with all visited states.
Consider the following state space exploration algorithm; similar algorithms can be found in
[U E]. We compute the state space of a model starting with an initial state in a set todo. For
each state s in todo, we check whether s has already been visited using a function visited.
If it has not, we add it to a set done and add all of its successor states (obtained by
calling, for example, a function succs(s)) to todo. At the end of the execution, the set of reachable
states is the set done. Symmetry detection can be performed in the function visited. With the
graph isomorphism approach, we have to check for an isomorphism with every state in done,
whereas with a canonical representation we just test whether the state is in done. The former
approach is more expensive because of the loop and the isomorphism computation, even if
this operation may be fast on distinct graphs [13], [12]. With the latter we can perform the test
in constant time using, for example, a hash table: since the representation is canonical, we
can build a hash function for table lookup and then use a few comparisons.
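
The exploration loop described above can be sketched as follows. This is only an illustration under stated assumptions: the names `explore`, `canon`, `toy_succs` and `toy_canon` are hypothetical, and the toy system stands in for a real model. The point is that the `visited` test becomes a single set lookup once states are mapped to a hashable canonical form.

```python
from collections import deque

def explore(initial, succs, canon=lambda s: s):
    """Worklist exploration; `canon` maps a state to a hashable canonical form."""
    done = set()                 # canonical forms of the visited states
    todo = deque([initial])
    while todo:
        state = todo.popleft()
        key = canon(state)
        if key in done:          # the `visited` test: one hash lookup
            continue
        done.add(key)
        todo.extend(succs(state))
    return done

# Toy system (hypothetical): a state is a tuple of thread ids. A fresh thread
# is spawned while fewer than two are alive, otherwise the oldest terminates.
def toy_succs(state):
    if len(state) >= 2:
        return [state[1:]]
    fresh = max(state, default=0) + 1    # ids are never reused
    return [state + (fresh,)]

def toy_canon(state):
    return len(state)    # in this toy system only the number of threads matters
```

With the identity as `canon` the toy exploration never terminates, because fresh identifiers keep producing new states; with `toy_canon` it stops after three classes.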

The method presented here does not detect all symmetries, whereas the graph isomorphism
method does. However, we expect to detect a large number of symmetries even in systems where
completeness is not achieved, so the trade-off between completeness and efficiency is interesting.
Moreover, we provide a sufficient condition for maximal reductions, i.e., for completeness.

This paper is structured as follows: first we introduce process identifiers and recall the 
definition of the Petri net models we use, next we introduce the data structure we will use 
to represent markings, then we present the theoretical basis for symmetry detection with our 
representation, and finally we discuss canonisation of the representation and a sufficient condition 
to achieve completeness. Due to space limitations, proofs have been omitted, but a version of this
paper including the proofs is available as a technical report [3].

2 Process identifiers and Petri nets 
2.1 Process identifiers 

Addressing systems with multiple threads or multiple processes is a tedious task. When the
number of processes is fixed, each process can be represented as a subnet of the model. This
approach is limited: since the number of processes is fixed, no new process can be
created. In real-life applications processes are created at runtime, so we need to reason about
dynamic process creation and the resulting behaviour.

The problem of reasoning about this kind of system has led to the development of different
formalisms and methods [15]. Here, we focus on an implementation in terms of Petri nets
[8] which allows for dynamic process creation and is mainly based on [6]. First we refine the
definition of process identifiers that will be used throughout this paper; the definition is compatible
with the one from [6]. The only difference is the introduction of an empty pid, which helps the
formal definition of pids. Next, we also add some useful operations to handle pids.

Definition 2.1 (process identifiers) Process identifiers (pids) are elements of $\mathbb{P} = (\mathbb{N}^+)^*$, the
set of tuples over non-zero natural numbers. $\mathbb{P}$ equipped with the tuple concatenation operation
and its identity element, the empty tuple $\langle\rangle$, is a monoid. We denote by $a_1.a_2.\cdots.a_n$ the process
identifier $\langle a_1, a_2, \ldots, a_n \rangle$.

Definition 2.2 Let $\pi$ be a pid. We define the length, the prefix and the set of subpids of $\pi$ as:

• $length(\pi) = n$ if $\exists n > 0$ such that $\pi = \langle a_1, \ldots, a_n \rangle$ and $0$ otherwise;

• $prefix(\pi) = \langle a_1, \ldots, a_{n-1} \rangle$ if $\exists n > 0$ such that $\pi = \langle a_1, \ldots, a_n \rangle$ and $\langle\rangle$ otherwise;

• $subpid(\pi) = \{\pi\} \cup subpid(prefix(\pi))$ if $length(\pi) > 0$ and $\emptyset$ otherwise.
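
As an illustration of Definition 2.2, pids can be encoded as Python tuples of positive integers, the empty tuple playing the role of the empty pid. The encoding and the function names below are assumptions made for this sketch only.

```python
def length(pid):
    """Number of components of a pid; the empty pid () has length 0."""
    return len(pid)

def prefix(pid):
    """The pid without its last component; the empty pid is its own prefix."""
    return pid[:-1]

def subpid(pid):
    """The pid together with all its non-empty prefixes."""
    if length(pid) == 0:
        return set()
    return {pid} | subpid(prefix(pid))
```

For instance, the subpids of 1.2.1 are 1.2.1, 1.2 and 1, and the empty pid has no subpid.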

Now, as in [6], we define the operations that can be used to compare process identifiers in a
model. These are the only operations that can be used in the model to compare
pids; any other operation is forbidden.

Definition 2.3 (operations on process identifiers) Two pids $\pi, \pi' \in \mathbb{P}$ can be compared
using equality and the following operations:

• $\pi \lhd_1 \pi'$ iff $\exists a \in \mathbb{N}^+$ such that $\pi.a = \pi'$;

• $\pi \lhd \pi'$ iff $\exists a_1, \ldots, a_n \in \mathbb{N}^+$ such that $\pi.a_1.\cdots.a_n = \pi'$;

• $\pi \curlyvee_1 \pi'$ iff $\exists \pi'' \in \mathbb{P}, \exists i \in \mathbb{N}^+$ such that $\pi'' \neq \langle\rangle$, $\pi = \pi''.i$ and $\pi' = \pi''.(i+1)$;

• $\pi \curlyvee \pi'$ iff $\exists \pi'' \in \mathbb{P}, \exists i, j \in \mathbb{N}^+$ such that $\pi'' \neq \langle\rangle$, $\pi = \pi''.i$, $\pi' = \pi''.j$ and $i < j$.

Intuitively, $\pi \lhd_1 \pi'$ means that $\pi$ is the parent of $\pi'$; $\pi \lhd \pi'$ means that $\pi$ is an ancestor of
$\pi'$; $\pi \curlyvee \pi'$ means that $\pi$ is a sibling of $\pi'$ that was spawned before $\pi'$ by the same
parent; finally, $\pi \curlyvee_1 \pi'$ means that $\pi$ is the sibling spawned just before $\pi'$.
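
The four relations can be sketched on pids encoded as tuples of positive integers. The function names are hypothetical, and the requirement that siblings have a non-empty parent follows our reading of the definition above.

```python
def is_parent(p, q):
    """p is the parent of q: q is p extended by exactly one component."""
    return len(q) == len(p) + 1 and q[:len(p)] == p

def is_ancestor(p, q):
    """p is an ancestor of q: q is p extended by one or more components."""
    return len(q) > len(p) and q[:len(p)] == p

def is_earlier_sibling(p, q):
    """p and q share a non-empty parent and p was spawned before q."""
    return len(p) == len(q) >= 2 and p[:-1] == q[:-1] and p[-1] < q[-1]

def is_previous_sibling(p, q):
    """p and q share a non-empty parent and p was spawned just before q."""
    return len(p) == len(q) >= 2 and p[:-1] == q[:-1] and p[-1] + 1 == q[-1]
```

Every parent is also an ancestor, and every previous sibling is also an earlier sibling, which mirrors the inclusion of the "one step" relations in the general ones.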

The last refinement on pids is the introduction of a total order between pids. This order will 
be important when addressing the detection of symmetries and canonical forms. 

Definition 2.4 (ordering) The set of all pids P is totally ordered with a hierarchical order, 
i.e., pids are ordered by length and if they have the same length, the lexicographic order on tuples 
is used. 
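
With pids encoded as Python tuples (an assumption of this sketch), the hierarchical order can be obtained with a sort key that compares lengths first and components second. The name `hierarchical_key` is hypothetical.

```python
def hierarchical_key(pid):
    """Sort key: shorter pids first, ties broken lexicographically."""
    return (len(pid), pid)

pids = [(1, 2, 1), (2,), (1, 3), (1,), (1, 2)]
ordered = sorted(pids, key=hierarchical_key)
```

Note that this differs from the plain lexicographic order on tuples, under which 1.2.1 would come before 2.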



2.2 Coloured Petri nets 

The models we address are (coloured) Petri nets which allow dynamic process creation. A formal
definition was given in [10], [11] and then refined in [6]; we base ourselves on the refined one. Here, we
recall the definition and some requirements. Let $\mathbb{V}$ be a set of variables, $\mathbb{D}$ a set of data values
and $\mathbb{E}$ a set of expressions such that $\mathbb{V} \cup \mathbb{D} \subseteq \mathbb{E}$. We assume that $\mathbb{E}$ contains Boolean expressions.
A binding is a partial function $\beta : \mathbb{V} \to \mathbb{P} \cup \mathbb{D}$. The application of a binding $\beta$ is extended to
denote by $\beta(e)$ the evaluation of an expression $e$ under $\beta$. The evaluation of an expression under a
binding is naturally extended to sets and multisets of expressions. Data values, variables, the syntax
of expressions, possibly typing rules, etc. form the colour domain of a Petri net.

Definition 2.5 (Petri nets) A Petri net is a tuple $(S, T, \ell)$ where $S$ is a finite set of places, $T$,
disjoint from $S$, is a finite set of transitions, and $\ell$ is a labelling function such that:

• for all $s \in S$, $\ell(s)$ is the type of $s$, a Cartesian product $X_1 \times \cdots \times X_k$ ($k \geq 1$), where each
$X_i$ is $\mathbb{P}$ or $\mathbb{D}$; this type denotes the values that $s$ may contain;

• for all $t \in T$, $\ell(t)$ is the guard of $t$, i.e., a condition for its execution;

• for all $(x, y) \in (S \times T) \cup (T \times S)$, $\ell(x, y)$ is a multiset over $\mathbb{E}$ and defines the arc from $x$ to $y$.

A marking of a Petri net is a map that associates to each place $s \in S$ a multiset of values from
$\ell(s)$. We denote by $Mrk$ the set of all markings. From a marking $M$, a transition $t$ can be fired
using a binding $\beta$ and yield a new marking $M'$, which is denoted by $M[t, \beta\rangle M'$, iff:

• there are enough tokens: for all $s \in S$, $M(s) \geq \beta(\ell(s,t))$;

• the guard is satisfied: $\beta(\ell(t))$ is true;

• place types are respected: for all $s \in S$, $\beta(\ell(t,s))$ is a multiset over $\ell(s)$;

• $M'$ is $M$ with tokens consumed and produced according to the arcs: for all $s \in S$, $M'(s) = M(s) - \beta(\ell(s,t)) + \beta(\ell(t,s))$.

The following requirements are adapted from [6], which generalised [10], [11] in two ways: first
by using transition systems in general instead of Petri nets in particular; second, by relaxing
the constraints. We have preferred to stay with Petri nets to extend our previous work [10], [11].
A Petri net respecting all the following requirements is called a thread Petri net (or t-net).

1. The set of places contains a unique generator place $s_\eta$ having type $\mathbb{P} \times \mathbb{N}$. The
generator place stores tokens $(\pi, i)$ where $i$ is the counter of child threads already spawned
by $\pi$. Thus the next threads created by $\pi$ will have pids $\pi.(i+1)$, $\pi.(i+2)$, etc. We say
that $\pi$ is generative at a marking $M$ if there is an $n \in \mathbb{N}$ such that $(\pi, n) \in M(s_\eta)$.

2. We assume that the initial marking $M_0$ is such that the generator place contains
exactly one token, $(\langle 1 \rangle, 0)$, and all the other places are empty or contain data values.

3. For each transition $t \in T$, the annotation on the arc from the generator place to $t$ is a set
of the form $\ell(s_\eta, t) = \{(p_1, c_1), \ldots, (p_k, c_k)\}$ where $k \geq 0$ and all $p_i$'s and $c_i$'s are distinct pid
and counter variables. The annotation on the arc from $t$ to the generator place is a set:

$$\ell(t, s_\eta) = \left\{ \begin{array}{l} (p_1, c_1 + n_1), \ldots, (p_m, c_m + n_m), \\ (p_1.(c_1 + 1), 0), \ldots, (p_1.(c_1 + n_1), 0), \\ \quad \vdots \\ (p_m.(c_m + 1), 0), \ldots, (p_m.(c_m + n_m), 0) \end{array} \right\}$$

where $m \leq k$, and $n_j \geq 0$ for all $j$. An empty arc annotation means that the arc is absent.
Below we denote by $\Pi_t$ the set of all newly created pids $p_i.(c_i + j)$ used in $\ell(t, s_\eta)$.

4. For each transition $t \in T$ and each place $s \in S \setminus \{s_\eta\}$, the annotation on the arc from $s$ to
$t$ is a multiset of vectors built from variables and data values, and the annotation on the
arc from $t$ to $s$ is a multiset of vectors built from expressions involving data variables and
data values as well as elements from $\Pi_t \cup \{p_1, \ldots, p_m\}$.

5. For each transition $t \in T$, $\ell(t)$ is a computable Boolean expression, built from the variables
occurring in the annotations of arcs adjacent to $t$ and data values. The usage of pids is
restricted to comparisons of the elements from $\Pi_t \cup \{p_1, \ldots, p_k\}$ using the operators from
$\{=, \lhd_1, \lhd, \curlyvee_1, \curlyvee\}$.

For each reachable marking $M$ we define the corresponding state $q_M = (\sigma_M, \eta_M)$ given by
$\sigma_M = \{(s, M(s)) \mid s \in S \setminus \{s_\eta\}\}$ and $\eta_M = \{\pi \mapsto k \mid (\pi, k) \in M(s_\eta)\}$. Given a state $q_M$ we define the
following notions:

• for each generative pid $\pi$ in $q_M$, i.e., $\pi \in dom(\eta_M)$, the next pid to be created is given by
$next_{q_M}(\pi) = \pi.(\eta_M(\pi) + 1)$;

• the next-pids of $q_M$ are $nextpid_{q_M} = \{next_{q_M}(\pi) \mid \pi \in dom(\eta_M)\}$;

• $pid_{q_M}$ is the set of all pids involved in $q_M$, i.e., pids from $s_\eta$ and all data places.

Definition 2.6 (state equivalence) Two states $q_M$ and $q_{M'}$ are equivalent if there is a
bijection $h : (pid_{q_M} \cup nextpid_{q_M}) \to (pid_{q_{M'}} \cup nextpid_{q_{M'}})$ such that for all relations $\prec \in \{\lhd_1, \lhd\}$ and
$\curlywedge \in \{\curlyvee_1, \curlyvee\}$:

1. $h(dom(\eta_M)) = dom(\eta_{M'})$;

2. $\forall \pi \in dom(\eta_M)$, $h(next_{q_M}(\pi)) = next_{q_{M'}}(h(\pi))$;

3. $\forall \pi, \pi' \in pid_{q_M}$: $\pi \prec \pi'$ iff $h(\pi) \prec h(\pi')$;

4. $\forall \pi, \pi' \in pid_{q_M} \cup nextpid_{q_M}$: $\pi \curlywedge \pi'$ iff $h(\pi) \curlywedge h(\pi')$;

5. $\sigma_{M'}$ is $\sigma_M$ after replacing each pid $\pi$ by $h(\pi)$.

We denote this by $q_M \sim_h q_{M'}$, or simply $q_M \sim q_{M'}$.

Two reachable markings $M$ and $M'$ of a Petri net are equivalent if and only if their respective
states $q_M$ and $q_{M'}$ are equivalent. This is denoted by $M \sim_h M'$ or simply $M \sim M'$.

This equivalence relation ensures that markings contain the same data tokens and that pids are
related through $h$. State (or context) equivalence guarantees that $h$ preserves the relations
between pids for $\lhd_1$, $\lhd$, $\curlyvee_1$ and $\curlyvee$. So if two markings are equivalent, they differ in pids, but
these pids have the same relations among them and thus differ only in names; since names are not
relevant, the states can be identified.

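Theorem 2.1 Let $M$ and $M'$ be $h$-equivalent reachable markings of a t-net $N$, and $t$ be a
transition such that $M[t, \beta\rangle \widetilde{M}$. Then $M'[t, h \circ \beta\rangle \widetilde{M}'$, where $\widetilde{M}'$ is a marking such that $\widetilde{M} \sim_{\widetilde{h}} \widetilde{M}'$ for a
bijection $\widetilde{h}$ coinciding with $h$ on the intersection of their domains.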

As stated in [10], [11], [6], the above result captures a truly strong notion of marking
similarity: if two markings are equivalent, their futures are the same modulo pid renaming. To
compute state equivalence, the approach proposed in [10], [11], [6] was first to build a graph and then
to compute graph isomorphisms. Computing these isomorphisms is an expensive step and was
hardly conceivable in a tool, since it has to be done against all visited states each time a new
state is discovered. The contribution of this paper is to provide a representation of markings
that detects this equivalence. It is presented in the following sections, but before moving on we add
some notation for tokens.

Definition 2.7 (owned, shared, active, and referenced pids) Let $v = (x_1, \ldots, x_n)$ be a
token. If $x_1 \in \mathbb{P}$ then $x_1$ is the owner of $v$; otherwise, if $x_1 \in \mathbb{D}$, we say that $v$ is shared. We denote
by $pid(v)$ the owner of $v$ if defined. If a pid $\pi$ appears in a token $v$ at a marking $M$ then we say
that $\pi$ is active at $M$. Finally we define the set of referenced pids of $v$ as the set of all pids
appearing in $v$ except $pid(v)$.
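
A small sketch of these notions, assuming tokens are Python tuples whose components are either pids (themselves tuples of integers) or data values (anything that is not a tuple). This encoding and the helper names are assumptions of the example.

```python
def is_pid(x):
    # Encoding assumption: pids are tuples, data values are not.
    return isinstance(x, tuple)

def owner(token):
    """The owner pid of a token, or None when the token is shared."""
    return token[0] if is_pid(token[0]) else None

def referenced(token):
    """All pids appearing in the token except its owner."""
    return {x for x in token if is_pid(x)} - {owner(token)}
```

A token whose first component is a data value is shared, yet it may still reference pids in its other components.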



3 Pid-trees 



A pid-tree is a representation of a marking where tokens are associated with process identifiers.
The intention is to classify them by ownership. The root of the tree contains data tokens
that are shared between different processes, and each of the remaining nodes contains tokens
that belong to a particular process. To identify these processes, each child node is prefixed with
a fragment of pid. Each path in the tree leads to a node that contains a marking where all
tokens belong to the process whose pid is the concatenation of all the pid fragments along the
path; we say that this pid labels the path. A path can be seen as a concatenation of pids that
leads to a, possibly empty, marking. A formal definition of paths is given later in this section,
and ownership constraints will be added when describing the construction of the tree. Pid-trees
are formally defined as follows.

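Definition 3.1 (pid-tree) The set $\Sigma$ of pid-trees is defined by $(M, C) \in \Sigma$ if $M \in Mrk$ and
$\exists n \in \mathbb{N}$ such that $C = ((a_1, t_1), \ldots, (a_n, t_n)) \in (\mathbb{P} \times \Sigma)^n$ and $C$ satisfies:

• $\forall i \in \{1, \ldots, n\}$, $a_i \neq \langle\rangle$;

• $\forall i, j \in \{1, \ldots, n\}$, if $i \neq j$ then $a_i \neq a_j$, $a_i \notin subpid(a_j)$ and $a_j \notin subpid(a_i)$.

We denote by $M \xrightarrow{a_1, \ldots, a_n} (t_1, \ldots, t_n)$ the pid-tree $(M, ((a_1, t_1), \ldots, (a_n, t_n)))$. We will also use the
notation $M_t$ to denote the marking $M$ in $t = (M, C)$.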
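
The structure can be sketched as a small recursive data type together with a check of the two conditions on children. The class `PidTree`, its field names and `is_valid` are hypothetical names chosen for this illustration; markings are left as plain dictionaries.

```python
from dataclasses import dataclass, field

@dataclass
class PidTree:
    marking: dict                                   # place name -> tokens
    children: list = field(default_factory=list)    # [(fragment, PidTree), ...]

def subpid(pid):
    """The pid and all its non-empty prefixes."""
    return {pid[:i] for i in range(1, len(pid) + 1)}

def is_valid(tree):
    """Check the conditions on pid fragments, recursively."""
    frags = [a for a, _ in tree.children]
    for i, a in enumerate(frags):
        if a == ():                                 # fragments are non-empty
            return False
        for b in frags[i + 1:]:
            # Equal fragments are caught too, since a belongs to subpid(a).
            if a in subpid(b) or b in subpid(a):
                return False
    return all(is_valid(t) for _, t in tree.children)
```

Two sibling fragments such as 2.1 and 2.2 are accepted, because neither is a prefix of the other, whereas 1 and 1.2 are rejected.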

Having tuples in the definition of children nodes allows us to order them syntactically.
The conditions in the second point of Definition 3.1 ensure that each pid associated with a
node is unique and, moreover, that we cannot have two children prefixed by the
same pid fragment. For example, $M \xrightarrow{1,1} (t_1, t_2)$ is an illegal pid-tree, as is $M \xrightarrow{1,1.2} (t_1, t_2)$ since
$1 \in subpid(1.2)$. A pid-tree can be represented graphically, see Figure 1: each node is labelled
with a marking and each arc with a pid fragment. In Figure 1(a), the process of pid 1 does not
own any tokens so the corresponding node is associated with an empty marking, but the process
of pid 1.2.1.3 owns some tokens so the corresponding node is associated with marking $M_3$.

To check the inclusion of pid-trees we start from the root: first we check the inclusion
of markings, and then we check the inclusion of children, which must be labelled with the same
pid fragments. It is important to notice that the inclusion operation on pid-trees preserves the
root, respects marking inclusion and does not denote a subtree relation. The formal definition
follows.

Definition 3.2 (inclusion) Let $t = (M, ((a_1, t_1), \ldots, (a_n, t_n))) \in \Sigma$ be a pid-tree. A pid-tree $t' =
(M', ((a'_1, t'_1), \ldots, (a'_m, t'_m))) \in \Sigma$, with $m \leq n$, is included in $t$, noted $t' \sqsubseteq t$, if $M' \leq M$ and for all
$i \in \{1, \ldots, m\}$, there is a $j \in \{1, \ldots, n\}$ such that $a'_i = a_j$ and $t'_i \sqsubseteq t_j$.
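
The inclusion check can be sketched as below, assuming a pid-tree is encoded as a pair `(marking, children)` where `marking` maps place names to `collections.Counter` multisets and `children` is a list of `(fragment, subtree)` pairs. The encoding and the function names are assumptions of this example.

```python
from collections import Counter

def marking_leq(m1, m2):
    """Multiset inclusion of markings, place by place."""
    return all(n <= m2.get(place, Counter())[tok]
               for place, toks in m1.items()
               for tok, n in toks.items())

def included(small, big):
    """Rooted inclusion: compare the roots, then children with equal fragments."""
    (m1, c1), (m2, c2) = small, big
    by_fragment = dict(c2)
    return (marking_leq(m1, m2)
            and all(a in by_fragment and included(t, by_fragment[a])
                    for a, t in c1))
```

Because the comparison always starts at the root, a tree whose only child is labelled 2 is not included in a tree whose only child is labelled 1, whatever appears deeper in the latter.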



For example, the pid-trees in Figures 1(b) and 1(c) are included in the pid-tree in Figure 1(a)
if $M'_1 \leq M_1$ and $M'_4 \leq M_4$. The pid-tree in Figure 1(d) is not included in any other pid-tree in
Figure 1. The pid-tree — 7-0 is included in the pid-trees in Figures 1(a), 1(b) and 1(c), whereas
— > is not included in any of the pid-trees in Figure 1, which is straightforward to understand if
we remember that we have to preserve the root.

Figure 1: Four pid-trees. (a) A pid-tree with root marking $M_1$ whose nodes carry the markings $\emptyset$, $M_2$, $M_3$
and $M_4$. (b) A pid-tree with root marking $M'_1$ and leaf marking $M'_4$. (c) and (d) Two pid-trees
that represent $path(1.1.1, M'_4)$. We assume that markings $M_i$ and $M'_i$ are not empty.

The definition of subtrees is similar to the usual definition on trees, however we add a 
localisation hint for the subtree in order to differentiate two subtrees that are structurally equal 
but at different positions in the tree. Thus a subtree will be a pair formed of a pid and a pid-tree. 



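Definition 3.3 (subtrees) Let $t_0 = M \xrightarrow{a_1, \ldots, a_n} (t_1, \ldots, t_n)$ be a pid-tree. A pid-tree $t$ is a
subtree of $t_0$ at $\pi$, noted $(\pi, t) \in Trees(t_0)$, if:

• $\pi = \langle\rangle$ and $t = t_0$; or

• $\exists a_i \in \{a_1, \ldots, a_n\}$ such that $\pi = a_i.\pi'$ and $(\pi', t) \in Trees(t_i)$.

We denote by $Trees(t)$ the set of all subtrees of $t$.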

Using the definition of subtrees, we can retrieve the set of all pids in a pid-tree which 
corresponds to all the localisations in the pid-tree. 

Definition 3.4 Let $t \in \Sigma$ be a pid-tree. We define $pid(t) = \{ \pi \mid (\pi, t') \in Trees(t) \}$.
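
A minimal sketch of the enumeration of located subtrees and of the set of pids of a pid-tree, assuming a tree is encoded as a pair `(marking, children)` with `children` a list of `(fragment, subtree)` pairs and fragments encoded as tuples. The names `trees` and `pids` are hypothetical, and markings are replaced by placeholder strings.

```python
def trees(tree, at=()):
    """Yield every (pid, subtree) pair; the root is located at the empty pid."""
    yield at, tree
    _, children = tree
    for fragment, child in children:
        yield from trees(child, at + fragment)

def pids(tree):
    """All localisations in the tree."""
    return {pid for pid, _ in trees(tree)}

example = ("M0", [((1,), ("M1", [((2, 1), ("M2", [])),
                                  ((3,), ("M3", []))]))])
```

The localisation of a node is the concatenation of the fragments met on the way from the root, so the child labelled 2.1 under the node at 1 is located at 1.2.1.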

Now that we have set up some basic blocks to handle pid-trees, we give a formal definition 
of paths. A path is a linear pid-tree with all markings empty except for the leaf one. 

Definition 3.5 (paths) Let $\pi \in \mathbb{P}$ be a pid, and $M \in Mrk$ a marking. A path labelled by $\pi$ and
decorated by $M$ is a pid-tree $t$ such that $\{\pi\} \subseteq pid(t) \subseteq subpid(\pi)$, $(\pi, (M, ())) \in Trees(t)$ and all
other markings in $t$ are empty. We denote by $path(\pi)$ the path $path(\pi, \emptyset)$ and by $paths(T)$ the set of all
paths in a pid-tree $T$ (with respect to $\sqsubseteq$).

As the definition suggests, there may be different pid-tree representations of a path:
these representations differ in the length of the path and in the pid fragments on arcs. For
example, the pid-trees in Figures 1(c) and 1(d) represent the same path $path(1.1.1, M'_4)$.

A path denotes an ownership relation between a process and some tokens. Intuitively,
$path(\pi, M)$ represents the fact that all tokens that appear in $M$ belong to the process of pid
$\pi$. We say that $\pi$ labels this path. Moreover, saying that a subtree $t$ is at $\pi$ means that the path
$path(\pi)$ leads to $t$.

When addressing process identifiers, their names are irrelevant, so we introduce an opaque
description of localisations in pid-trees. We could use pids to locate nodes, but we need to
abstract them, so we use relative paths, which describe the localisation of a subtree independently
of the labels appearing on the edges. A relative path can be seen as a normalised form of a pid
based on its position inside the tree.

Definition 3.6 (relative paths) Let $t \in \Sigma$ be a pid-tree. A relative path $path(\pi, t)$ of $\pi \in pid(t)$
is the tuple defined as:

• $(i)$ if $\pi = a_i$ and $t = M \xrightarrow{a_1, \ldots, a_n} (t_1, \ldots, t_n)$ where $1 \leq i \leq n$;

• $(i, i_1, \ldots, i_m)$ if $\pi = a_i.\pi'$, $t = M \xrightarrow{a_1, \ldots, a_n} (t_1, \ldots, t_n)$ and $(i_1, \ldots, i_m) = path(\pi', t_i)$ where $1 \leq i \leq n$.
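
Relative paths can be computed by walking down the tree and recording child positions instead of fragments. In this sketch a tree is a pair `(marking, children)` with an ordered list of `(fragment, subtree)` children. The function name, the convention that the root gets the empty path, and the `None` result for pids absent from the tree are assumptions of the example.

```python
def relative_path(pid, tree):
    """Tuple of 1-based child positions leading to `pid`, or None if absent."""
    if pid == ():
        return ()                 # assumption: the root has the empty path
    _, children = tree
    for i, (fragment, child) in enumerate(children, start=1):
        if pid[:len(fragment)] == fragment:
            rest = relative_path(pid[len(fragment):], child)
            if rest is not None:
                return (i,) + rest
    return None

leaf = ("", [])
t1 = ("", [((1,), ("", [((1,), leaf), ((3,), leaf), ((2, 1), leaf), ((2, 2), leaf)]))])
t2 = ("", [((1,), ("", [((4,), leaf), ((7,), leaf), ((3, 1), leaf), ((3, 2), leaf)]))])
```

In the two hypothetical trees above, the pids 1.2.1 and 1.3.1 have different names but the same relative path, which is what makes relative paths suitable for abstracting pids.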

Finally, since we have the ability to consider the ordering of children of a node, we introduce 
labelling functions that will be used to compare and order these children. 

Definition 3.7 Let $(E, \leq)$ be a partially ordered set. A labelling function $\ell$ is a partial function
from $\mathbb{P} \times \Sigma$ to $E$.

Definition 3.8 Let $t \in \Sigma$ be a pid-tree and $\ell$ a labelling function. The pid-tree $t$ is ordered with
respect to $\ell$ if:

• $t = (M, ())$; or

• $t = M \xrightarrow{a_1, \ldots, a_n} (t_1, \ldots, t_n)$, $\ell(a_1, t_1) < \cdots < \ell(a_n, t_n)$ and $t_1, \ldots, t_n$ are ordered with respect to $\ell$.

4 Checking marking equivalence 

To apply Theorem 2.1 and reduce state spaces we need to detect marking equivalence. We
check marking equivalence by mapping markings to pid-trees and comparing these trees. If the
trees are equivalent, the markings are equivalent. However, the representation we use is tightly
linked to the relations used to compare pids in the models. More precisely, the representation
we use embeds the relations $\lhd_1$, $\lhd$, $\curlyvee_1$ and $\curlyvee$.

Intuitively, a pid-tree embeds the relation $\lhd$ by construction because it is a hierarchy of pids.
In this section we need to include the relation $\curlyvee$ into pid-trees as well as $\lhd_1$ and $\curlyvee_1$. To do so,
we use a specific labelling and ordering on child nodes.

Definition 4.1 (sibling ordering) Let $t \in \Sigma$ be a pid-tree. The pid-tree $t$ is sibling ordered if
it is ordered with the labelling function $\ell : \mathbb{P} \times \Sigma \to \mathbb{P}$ defined as:

$\ell(a_i, t') = a_i$

where $\mathbb{P}$ is equipped with the hierarchical order on pids.

Using the sibling ordering we can define the representation of markings that will be used
to detect symmetries. The representation is described in terms of three rules that have to be
obeyed when building the pid-tree. The resulting pid-tree is not unique, as shown in Figure 2,
but the definition gives all the theoretical requirements to guarantee marking equivalence.

Definition 4.2 (pid-tree representation of markings) Let $M$ be a reachable marking of a
t-net $N$. We build a representation of $M$ as a sibling ordered pid-tree, and $repr(M)$ denotes the
set of all such representations. $R(M) \in repr(M)$ if for each place $s \in N$ whose type is $X_1 \times \cdots \times X_n$,
and for each token $v = (x_1, \ldots, x_n) \in M(s)$, with $X_j \in \{\mathbb{D}, \mathbb{P}\}$, we apply exactly one of the following
rules:

• generator token rule: if $s = s_\eta$ and $v = (\pi, i)$ then we have $path(\pi) \sqsubseteq R(M)$ and $path(\pi.(i+1)) \sqsubseteq R(M)$,
i.e., $path(next_{q_M}(\pi)) \sqsubseteq R(M)$;

• shared token rule: if $X_1 = \mathbb{D}$ then we have $path(\langle\rangle, \{(s, v)\}) \sqsubseteq R(M)$, and for each $x_i$ such
that $X_i = \mathbb{P}$ we have $path(x_i) \sqsubseteq R(M)$;

• owned token rule: if $X_1 = \mathbb{P}$ then we have $path(x_1, \{(s, v)\}) \sqsubseteq R(M)$, and for each $x_i$ such
that $X_i = \mathbb{P}$ we have $path(x_i) \sqsubseteq R(M)$.
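
The three rules can be sketched as a construction that inserts one path per pid, here always using arcs of length one. This is only one of the possible representations, and the encoding is an assumption of the example: nodes are dictionaries, pids are tuples, tokens are tuples of pids and data values, and the generator place is named `"s_eta"`.

```python
def new_node():
    return {"marking": [], "children": {}}

def node_at(root, pid):
    """Return the node located at `pid`, creating one-component arcs on the way."""
    node = root
    for a in pid:
        node = node["children"].setdefault((a,), new_node())
    return node

def is_pid(x):
    return isinstance(x, tuple)        # encoding assumption: pids are tuples

def represent(marking, generator="s_eta"):
    root = new_node()
    for place, tokens in marking.items():
        for v in tokens:
            if place == generator:                 # generator token rule
                pid, count = v
                node_at(root, pid)
                node_at(root, pid + (count + 1,))  # path to the next pid
                continue
            owner = v[0] if is_pid(v[0]) else ()   # shared tokens go to the root
            node_at(root, owner)["marking"].append((place, v))
            for x in v:                            # every pid in v gets a path
                if is_pid(x):
                    node_at(root, x)
    return root

def pids(node, at=()):
    """All localisations of the tree rooted at `node`."""
    out = {at}
    for fragment, child in node["children"].items():
        out |= pids(child, at + fragment)
    return out
```

Children are stored in a dictionary, so a sibling ordered traversal would iterate over the fragments sorted with the hierarchical order.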




Figure 2: Two sibling ordered pid-trees, (a) and (b), representing the marking $M = \{s_1 \mapsto \{12\}, s_2 \mapsto
\{(1), (1,1,2)\}, s_\eta \mapsto \{((1), 1), ((1,1,2), 0)\}\}$ with $\ell(s_1) = \mathbb{N}$, $\ell(s_2) = \mathbb{P}$ and $\ell(s_\eta) = \mathbb{P} \times \mathbb{N}$.

Checking the equality on sibling ordered pid-trees will not offer any advantages over compar- 
ing markings for symmetry detection. To capture symmetries we need to abstract pids because 
their values are irrelevant and only the relations between pids matter. This leads to the definition 
of an equivalence relation of sibling ordered pid-trees. 

Definition 4.3 (sibling ordered pid-tree equivalence) Let $T$ and $T'$ be two sibling ordered
pid-trees. $T$ and $T'$ are equivalent if:

1. the function $h : pid(T) \to pid(T')$ such that $h = \{ (\pi, \pi') \mid \pi \in pid(T), \pi' \in pid(T')$ and
$path(\pi, T) = path(\pi', T') \}$ is a bijection;

2. for each pair of subtrees $(\pi, t) \in Trees(T)$ and $(\pi', t') \in Trees(T')$ such that $\pi' = h(\pi)$ and
$t = M_t \xrightarrow{a_1, \ldots, a_n} (t_1, \ldots, t_n)$, $t' = M_{t'} \xrightarrow{a'_1, \ldots, a'_{n'}} (t'_1, \ldots, t'_{n'})$ we have:

(a) the trees have the same structure, i.e., $n = n'$, which is implied by the definition of $h$;

(b) for each place $s$ different from $s_\eta$, $M_{t'}(s)$ can be obtained from $M_t(s)$ by replacing each
pid $\pi$ occurring in the tuples of $M_t(s)$ by $h(\pi)$;

(c) for each $a_i$ we have

• $length(a_i) = length(a'_i) = 1$; or

• $length(a_i) > 1$ and $length(a'_i) > 1$;

(d) for each $a_i$ and $a_{i+1}$ such that $prefix(a_i) = prefix(a_{i+1})$ and $1 \leq i \leq n - 1$ we have
$prefix(a'_i) = prefix(a'_{i+1})$ and

• $a_{i+1} - a_i = a'_{i+1} - a'_i = 1$; or

• $a_{i+1} - a_i > 1$ and $a'_{i+1} - a'_i > 1$.

We will denote this by $T \approx_h T'$ or simply by $T \approx T'$.
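
Conditions (c) and (d) only involve the fragments labelling the children of two corresponding nodes, so they can be sketched on two sibling ordered lists of fragments. The function name is hypothetical, the offset of two fragments with the same prefix is taken to be the difference of their last components, and the prefix condition is checked in both directions since the relation is meant to be symmetric; these are assumptions of this example.

```python
def fragments_match(frags, frags2):
    """Length and offset conditions for two sibling ordered fragment lists."""
    if len(frags) != len(frags2):
        return False
    pairs = list(zip(frags, frags2))
    for a, b in pairs:                       # (c): both of length 1, or both longer
        if (len(a) == 1) != (len(b) == 1):
            return False
    for (a, b), (a_next, b_next) in zip(pairs, pairs[1:]):
        same = a[:-1] == a_next[:-1]         # consecutive siblings on the left
        same2 = b[:-1] == b_next[:-1]        # consecutive siblings on the right
        if same != same2:
            return False
        if same:                             # (d): offsets both 1, or both larger
            offset = a_next[-1] - a[-1]
            offset2 = b_next[-1] - b[-1]
            if (offset == 1) != (offset2 == 1):
                return False
    return True

children_1 = [(1,), (3,), (2, 1), (2, 2)]    # fragments 1, 3, 2.1, 2.2
children_2 = [(4,), (7,), (3, 1), (3, 2)]    # fragments 4, 7, 3.1, 3.2
```

The two lists above match: the offsets 3 - 1 and 7 - 4 are both greater than one, and the offsets between 2.1, 2.2 and between 3.1, 3.2 are both equal to one. A hypothetical list 4, 5, 3.1, 3.2 would not match the first one, because 5 - 4 = 1 while 3 - 1 > 1.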
The function $h$ in the definition above is a bijection if and only if the two pid-trees have the
same structure, because pids are related through $h$ if and only if their relative paths are equal.
The second point first states that the trees have the same structure and that the markings differ only in pids,
these pids being related through $h$; it then checks conditions on pid fragments. Length conditions
ensure that $\lhd_1$ and $\lhd$ are preserved: if the length of a fragment is equal to one then the
parent and the child are in $\lhd_1$, otherwise the parent and the child are in $\lhd$. Offset conditions
for fragments with the same prefix ensure that $\curlyvee_1$ and $\curlyvee$ are preserved. If prefixes are equal then
the children are siblings. If offsets are equal to one then the corresponding pids are in $\curlyvee_1$, and if
offsets are greater than one the pids are in $\curlyvee$.

It has to be stressed that we cannot use $length(a_i) = length(a'_i)$ instead of $length(a_i) > 1$ and
$length(a'_i) > 1$, because the number of inactive pids between them and their parents does
not matter as long as $\lhd$ is preserved. We qualify a pid as inactive if it does not appear in the
marking (including the generator place) used to build the tree. The same remark applies to
offsets and $\curlyvee$.

The equivalence is illustrated in Figure 3. The pid-trees in this figure are valid representations
of markings; however, we do not explain here how we choose the pid-tree we want among the valid
representations; this is discussed in Section 5. Pid-trees $T_1$ (Figure 3(a)) and $T_2$ (Figure 3(b))
are equivalent if $h = \{1 \mapsto 1, 1.1 \mapsto 1.4, 1.3 \mapsto 1.7, 1.2.1 \mapsto 1.3.1, 1.2.2 \mapsto 1.3.2, 1.1.1 \mapsto 1.4.2\}$
satisfies the marking, length and prefix conditions. Marking conditions are satisfied: $h((1)) = (1)$
and $h((1.1, 1.2.1, 1.2.2)) = (1.4, 1.3.1, 1.3.2)$. Prefix and length conditions are satisfied as well:

• we have $prefix(1) = prefix(3) = ()$ and $prefix(4) = prefix(7) = ()$; we check offsets: $3 - 1 > 1$ and
$7 - 4 > 1$;

• we have $prefix(2.1) = prefix(2.2) = (2)$ and $prefix(3.1) = prefix(3.2) = (3)$; we check offsets: $2 - 1 =
2 - 1 = 1$.

Pid-trees $T_3$ (Figure 3(c)) and $T_4$ (Figure 3(d)) are not equivalent to the others in Figure 3 although
they have the same structure; checking the prefix conditions is enough to convince ourselves.

Intuitively, the length conditions guarantee that $\lhd_1$ is preserved, and the prefix conditions
guarantee that the distances between pids with respect to the relation $\curlyvee_1$ are correct, and thus that
replaced pids have the same relations with their siblings and ancestors.

Now we will see that we can define a restriction of $h$ to $pid_{q_M} \cup nextpid_{q_M}$ that is also a bijection,
and that we can extract the original marking from the pid-tree representation. This is stated
by the following propositions. First we bound the pids appearing in a representation, then
we state that different restrictions are bijections as well.

Proposition 4.1 Let $M$ be a reachable marking of a t-net, $q_M$ its state and $R(M)$ its
representation. Then

$$pid_{q_M} \cup nextpid_{q_M} \subseteq pid(R(M)) \subseteq subpid_{q_M} \cup nextpid_{q_M}$$

where $subpid_{q_M} = \bigcup_{\pi \in pid_{q_M}} subpid(\pi)$.

Proposition 4.2 Let $M$ and $M'$ be two reachable markings of a t-net, $q_M, q_{M'}$ their respective
states and $R(M), R(M')$ their representations. If $R(M) \approx_h R(M')$ then:

1. the restriction $h_1 : pid_{q_M} \to pid_{q_{M'}}$ of $h$ is a bijection;

2. the restriction $h_2 : nextpid_{q_M} \to nextpid_{q_{M'}}$ of $h$ is a bijection;

3. the restriction $h_3 : pid_{q_M} \cup nextpid_{q_M} \to pid_{q_{M'}} \cup nextpid_{q_{M'}}$ of $h$ is a bijection.
Figure 3: Four sibling ordered pid-trees. (a) Pid-tree $T_1$, a representation of $M_1 \cup M_2 \cup \{s_\eta \mapsto \{(1,2), (1.1,0)\}\}$.
(b) Pid-tree $T_2$, a representation of $M_1 \cup M_3 \cup \{s_\eta \mapsto \{(1,6), (1.4,1)\}\}$.
(c) Pid-tree $T_3$, a representation of $M_1 \cup M_4 \cup \{s_\eta \mapsto \{(1,4), (1.4,1)\}\}$.
(d) Pid-tree $T_4$, a representation of $M_1 \cup M_5 \cup \{s_\eta \mapsto \{(1,5), (1.4,1)\}\}$.
Here $M_1 = \{s_1 \mapsto \{(1)\}\}$, $M_2 = \{s_2 \mapsto \{(1.1, 2.1, 2.2)\}\}$,
$M_3 = \{s_2 \mapsto \{(1.4, 3.1, 3.2)\}\}$, $M_4 = \{s_2 \mapsto \{(1.4, 3.1, 3.2)\}\}$ and $M_5 = \{s_2 \mapsto \{(1.4, 3.1, 3.3)\}\}$.



The fohowing proposition hnks aU tokens in marking (except tokens from ) with tokens in 
the tree. 

Proposition 4.3 Let M be a reachable marking of a t-net N. Then for each place s in N such 
that s 7^ we have: 

M{s)=i U M,{s)\ 

\{7l,t)eTrees(R{M)) J 

The next theorem states that equivalence of the representations of two markings implies 
equivalence of the markings themselves. Considering Theorem 2.1, this leads to a potential 
reduction of state spaces due to symmetries. 

Theorem 4.1 Let $M_1$ and $M_2$ be two reachable markings of a t-net. If $R(M_1)$ and $R(M_2)$ are 
equivalent then $M_1 \sim M_2$. 

It is worth noting that at this stage, markings do not have a unique representation. However 
in the following section we will introduce a sufficient condition to detect all symmetries within 
a system. 
5 Discussing canonisation 

As shown in the previous section, there are different representations of a marking. These 
representations differ in the number of paths they contain. But as shown in Proposition 4.1, the 
set of pids in a pid-tree is bounded by the following inclusions: 

$$\mathit{pid}_{q_M} \cup \mathit{nextpid}_{q_M} \subseteq \mathit{pid}(R(M)) \subseteq \mathit{subpid}_{q_M} \cup \mathit{nextpid}_{q_M}$$

In the next two subsections we will explore two canonical representations of sibling ordered 
pid-trees. The first one corresponds to the upper bound of the representation and the second one 
to the lower bound. 



5.1 Expanded pid-tree representation 



At this point we have a representation of markings as sibling ordered pid-trees and a pid-tree 
equivalence relation that implies an equivalence of markings, which leads to the detection of 
symmetries. However, using the rules we provide, one can build different pid-trees that respect 
Definition 4.2. Indeed, the definition of a path (Definition 3.5) allows branches inside the 
tree to be stripped or expanded. 

Definition 5.1 (expanded pid-trees) A pid-tree $T \in E$ is expanded if for all $(\pi, t)$ in 
$\mathit{Trees}(T)$ such that $t = M \xrightarrow{a} (t_1, \ldots, t_n)$ we have 
$\mathit{length}(a_i) = 1$ for $1 \leq i \leq n$. 

In an expanded pid-tree all pids associated to children nodes are of length 1. Intuitively we 
can see an expanded pid-tree as the biggest representation of a pid-tree: no more nodes can be 
added to the tree without breaking the property about bounds. Pid-trees in Figures 1(b) and 
1(c) are expanded pid-trees but pid-trees in Figures 1(a) and 1(d) are not. 



Proposition 5.1 Let $M$ be a reachable marking of a t-net $N$. There is a unique expanded pid-tree 
in $\mathit{repr}(M)$. 
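
One way to picture an expanded pid-tree is as a trie in which every edge carries exactly one 
pid component. The sketch below is an illustration only, not the construction used in the 
formal development: pids are assumed to be tuples of integers, markings attached to nodes are 
omitted, the tree is a nested dictionary, and the names `expanded_tree` and `tree_pids` are 
hypothetical. 

```python
def expanded_tree(pids):
    """Build a trie in which every edge is labelled by one pid component."""
    root = {}
    for pid in pids:
        node = root
        for component in pid:
            # Intermediate prefixes are created even if they are inactive.
            node = node.setdefault(component, {})
    return root


def tree_pids(tree, prefix=()):
    """Enumerate the pid of every node of the trie."""
    result = set()
    for component, child in tree.items():
        pid = prefix + (component,)
        result.add(pid)
        result |= tree_pids(child, pid)
    return result


# Hypothetical input: pids 1, 1.1.1 and 1.2 must appear in the tree.
tree = expanded_tree([(1,), (1, 1, 1), (1, 2)])
```

In this encoding the result does not depend on the order in which pids are inserted, which 
mirrors the uniqueness stated in Proposition 5.1, and the prefix 1.1 becomes a node even though 
it was not part of the input. 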



5.2 Stripped pid-tree representation 

Now we will minimize the tree by removing all inactive pids. When a path labelled by $\pi$ is 
added to the tree we potentially add all the subpids of $\pi$, which may be inactive. Removing 
all inactive pids leads to stripped pid-trees. 

Definition 5.2 A stripped pid-tree is a pid-tree $T$ such that for all 
$\mathit{path}(\pi) \in \mathit{paths}(T)$ we have $\pi$ active. 

This second canonical form will lead to more symmetry detection. This may seem counter-intuitive 
since we remove data from the tree, but removing these data also removes constraints we add on 
pids. These constraints are introduced by inactive pids that do not appear in the marking, and 
removing them leads to a better characterisation of the relations between pids. Because inactive 
pids do not appear in the original marking, they should not have an impact on symmetry 
detection. Stripping a pid-tree removes these pids. An example is given in Figure 4. 

Proposition 5.2 Let $M$ be a reachable marking of a t-net $N$. There is a unique stripped pid-tree 
in $\mathit{repr}(M)$. 
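
Stripping can be pictured as compressing a trie so that only the pids to keep remain as nodes, 
with each edge labelled by the pid suffix relative to the nearest kept ancestor. The sketch 
below is one possible reading of this idea and not the formal construction: pids are assumed to 
be tuples of integers, markings are omitted, the set passed in is assumed to contain the active 
pids and next-pids, and the name `stripped_tree` is hypothetical. 

```python
def stripped_tree(pids):
    """Build a compressed trie whose nodes are exactly the given pids.

    Edges are labelled by the pid suffix relative to the nearest kept
    ancestor, so inactive intermediate pids never become nodes.
    """
    keep = set(pids)

    def nearest_kept_ancestor(pid):
        # Longest proper prefix that is itself kept, or () for the root.
        for i in range(len(pid) - 1, 0, -1):
            if pid[:i] in keep:
                return pid[:i]
        return ()

    nodes = {(): {}}
    # Shorter pids first, so that an ancestor exists before its descendants.
    for pid in sorted(keep, key=len):
        nodes[pid] = {}
        anchor = nearest_kept_ancestor(pid)
        nodes[anchor][pid[len(anchor):]] = nodes[pid]
    return nodes[()]


# Hypothetical input: 1, 1.1.1 and 1.2.1 are kept; 1.1 and 1.2 are inactive.
tree = stripped_tree([(1,), (1, 1, 1), (1, 2, 1)])
```

Here the inactive pids 1.1 and 1.2 do not appear as nodes: the two leaves hang directly below 
pid 1 through edges labelled `(1, 1)` and `(2, 1)`. 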
Even if removing inactive pids provides a better characterisation of markings, some 
symmetries may still be missed. An example is given in Figures 4(e) and 4(f): the trees are not 
equivalent but the corresponding states are. Because pids 1.1 and 1.2 are inactive, the relation 
between them can be relaxed (they cannot be compared within the model) and thus 1.1.1 and 
1.2.1 can exchange roles. So we now introduce a sufficient condition to detect all symmetries 
with pid-trees. 




(d) Stripped version of $T_2$. (e) A stripped pid-tree. (f) A stripped pid-tree. 

Figure 4: Two pid-trees which are not equivalent (4(a), 4(b)) but become equivalent when 
transformed into stripped pid-trees (4(c), 4(d)). The pid-trees in Figures 4(e) and 4(f) are 
not equivalent but denote equivalent states. Most markings have been omitted; black nodes 
represent inactive pids, gray nodes represent next pids. 



5.3 Sufficient condition for a unique representation 

To start with, we introduce the notion of clean marking. A marking is clean if for every active 
pid, or next-pid, its parent is active, except for the pid 1, "the father of all pids". This 
notion has several consequences. First, there will be a unique representation of a marking, 
i.e., $\mathit{repr}(M) = \{R(M)\}$. The second consequence is that we will have: 

$$\mathit{pid}_{q_M} \cup \mathit{nextpid}_{q_M} = \mathit{pid}(R(M)) = \mathit{subpid}_{q_M} \cup \mathit{nextpid}_{q_M}$$

So the stripped and expanded representations of the marking are equal. 

Proposition 5.3 Let $M$ be a reachable clean marking of a t-net. Then there is a unique pid-tree 
representation of $M$. 
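
The cleanness condition is a simple local check. The following sketch illustrates it under two 
assumptions that are not part of the formal definition: pids are encoded as tuples of integers, 
and the parent of a pid or next-pid is obtained by dropping its last component. The name 
`is_clean` is hypothetical. 

```python
def is_clean(active, nextpids, root=(1,)):
    """True if every active pid or next-pid other than the root has an active parent."""
    active = set(active)
    return all(
        pid[:-1] in active              # parent = pid without its last component
        for pid in active | set(nextpids)
        if pid != root                  # pid 1 has no parent to check
    )


# 1 and 1.1 are active; 1.2 and 1.1.1 are next-pids: every parent is active.
clean = is_clean({(1,), (1, 1)}, {(1, 2), (1, 1, 1)})

# 1.1.1 is active but its parent 1.1 is not: the marking is not clean.
not_clean = is_clean({(1,), (1, 1, 1)}, set())
```

With this encoding the first call returns `True` and the second returns `False`. 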
Now we show that the bijection used to check pid-tree equivalence is exactly the bijection 
used to check equivalence between markings. 

Proposition 5.4 Let $M_1$, $M_2$ be two reachable markings in a t-net. If $M_1$, $M_2$ are clean 
and $M_1 \sim M_2$ then there is a bijection $h$ defined as 

$$h = \{(\pi, \pi') \mid \pi \in \mathit{pid}(R(M_1)),\ \pi' \in \mathit{pid}(R(M_2)) \text{ and } \mathit{path}(\pi, R(M_1)) = \mathit{path}(\pi', R(M_2))\}$$

such that $M_1 \sim_h M_2$. 

This proposition states that paths inside the pid-tree representation perfectly characterise 
the bijection between states corresponding to clean markings. It implies that if our pid-trees are 
equivalent then the bijection used to check pid-tree equivalence also satisfies the state equivalence 
requirements. 
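
The relation of Proposition 5.4 pairs pids that have equal paths in the two representations. 
The sketch below shows how such a relation could be built and tested for bijectivity. It is an 
illustration under assumptions: each representation is summarised by a dictionary from pids to 
paths, paths are treated as opaque hashable values (how a path is actually computed is not 
modelled here), and the name `bijection_from_paths` is hypothetical. 

```python
def bijection_from_paths(paths1, paths2):
    """Pair pids with equal paths; return None if the pairing is not a bijection."""
    by_path = {}
    for pid, path in paths2.items():
        by_path.setdefault(path, []).append(pid)

    h = {}
    for pid, path in paths1.items():
        candidates = by_path.get(path, [])
        if len(candidates) != 1:        # no partner, or an ambiguous one
            return None
        h[pid] = candidates[0]

    # Same number of pids on both sides and every target reached exactly once.
    if len(h) != len(paths2) or len(set(h.values())) != len(paths2):
        return None
    return h


# Hypothetical summaries: the path values "p0", "p1" are placeholders.
left = {(1,): "p0", (1, 1): "p1"}
right = {(1,): "p0", (1, 4): "p1"}
h = bijection_from_paths(left, right)
```

On this toy input the pairing maps pid 1 to itself and pid 1.1 to pid 1.4; when two pids share a 
path, or when a path has no counterpart, the function returns `None`. 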

Theorem 5.1 Let $M_1$ and $M_2$ be two reachable markings. If $M_1$ and $M_2$ are clean then 
$R(M_1) \sim R(M_2)$ iff $M_1 \sim M_2$. 

When addressing clean markings we detect all symmetries in state spaces. Working with clean 
markings may seem a strong constraint; however, it can be ensured by a restriction at the 
modelling level. One can for example force a process to wait for its children to finish before 
dying; in this case every reachable marking will be clean. 

Moreover, when addressing arbitrary systems, we can use, for example, stripped pid-trees to 
detect potential symmetries with the guarantee that for every discovered clean marking, we will 
detect all symmetries. 
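
To illustrate why a canonical representation matters for exploration, the following generic 
sketch stores one state per canonical key, so that each new state costs a hash lookup instead 
of a comparison with every visited state. It is not the algorithm of this paper: `successors` 
and `canon` are hypothetical parameters, and `canon` is assumed to return a hashable key that 
is identical for symmetric states (for pid-trees, such a key would have to abstract from the 
concrete pids). 

```python
from collections import deque


def explore(initial, successors, canon):
    """Breadth-first exploration keeping one representative per canonical key."""
    seen = {canon(initial): initial}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            key = canon(nxt)
            if key not in seen:         # hash lookup, no pairwise comparison
                seen[key] = nxt
                queue.append(nxt)
    return seen


# Toy model: two interchangeable counters, each of which can grow up to 2.
def toy_successors(state):
    return [
        state[:i] + (state[i] + 1,) + state[i + 1:]
        for i in range(len(state))
        if state[i] < 2
    ]


full = explore((0, 0), toy_successors, canon=lambda s: s)
reduced = explore((0, 0), toy_successors, canon=lambda s: tuple(sorted(s)))
```

In this toy model the unreduced exploration visits 9 states, while sorting the counters as a 
canonical key leaves 6 representatives. 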



6 Conclusion 

We have developed a theoretical package for symmetry detection in Petri nets with dynamic 
process creation. This approach is not complete; however, it leads to effective algorithms that 
can be implemented in a real-life model-checker. We have also shown its limits and discussed 
its application and the completeness of symmetry detection. This has been done using a tree 
data structure, the pid-tree, and an equivalence relation among pid-trees. This package has been 
developed in a general scope, which results in potential symmetry detection. Then 
we have looked for canonisation with expanded and stripped pid-trees in order to get a unique 
representation of a marking. Finally we expressed a sufficient condition to detect all symmetries 
within a model. Using pid-trees leads to time efficient algorithms to detect symmetries during 
state space exploration, which is not the case when resorting to graph isomorphism computation. 

The different canonical forms we have introduced make it possible to implement the approach and 
detect symmetries. Even if not all symmetries are detected, we believe that many of them are 
captured. We will try to confirm this statement in future work. We intend to implement our 
reductions by symmetries within the Neco net compiler [4, 5]. This will lead to case studies and 
several benchmarks to validate our intuition about expected performance; however, we will not be 
able to test every existing model. This is why the sufficient condition we have introduced in 
Section 5.3 is important. It guarantees that we will detect all symmetries in a class of models; 
moreover, if the condition is not satisfied, the method remains valid even if we lose this 
guarantee of full detection. 
We will also explore possibilities given by the space between our two bounds, expanded and 
stripped pid-trees, using more precise labelling functions to capture symmetries, and address 



equivalent using a labelling function which takes into account the size, or contents, of markings. 

The trade-off between complexity of the computation and completeness was the main motivation 
of this work. Indeed, a complete approach with an exponential time complexity would not be 
usable in a real-life model-checker, where the number of discovered states is almost always 
exponential in the size of the model. Even if reduction by symmetries may exponentially reduce 
the size of the state space, such an expensive approach would remain unusable due to the 
number of states, as explained in the introduction. Thus partial methods like this one should 
be considered in practice even if they do not capture every symmetry. 